
Conversation

@Jasper-Harvey0
Collaborator

Hello! I have been thinking about the possibility of moving our test logs over to JSON from the CSV format we currently have (which isn't really CSV).

I thought I would hack something together quickly and put the idea out there for everyone to comment on. I know there are some stirrings of getting a logging database going, and I think moving over to a JSON format could help in that regard, as it's a lot easier to parse than the CSV.

Some notes about the current implementation:

  • Not all the data gets populated. Fields like part number, date, and probably some others that I am forgetting do not get written to the log yet. That is just because I wanted to get the bones down for people to comment on.
  • I took a little bit of inspiration from the format of the OpenHTF logs, with some minor changes. Most notably, OpenHTF stores tests as keys in a dictionary, so you can't have multiple tests with the same name and repeated tests cannot be logged. I changed this to store tests in a list instead.

You can run the tiny.py example to see what the log looks like. Currently the log just dumps to C:\Users\username\AppData\Local\Fixate\Logs
Example:
test_log_20250523_124550.json
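For discussion, here is one hypothetical shape such a log could take, following the notes above: metadata fields at the top level, and tests stored in a list so repeated runs can be logged. Every field name here is illustrative, not taken from the PR.

```python
import json

# Hypothetical log shape -- all field names are illustrative,
# not the actual format produced by the PR.
log = {
    "part_number": "AMP-1234",   # placeholder metadata
    "date": "2025-05-23",
    "tests": [
        # A list (rather than a dict keyed by test name) means the same
        # test can appear more than once, e.g. a repeated run.
        {"name": "test_voltage", "result": "PASS", "runtime": 1.52},
        {"name": "test_voltage", "result": "PASS", "runtime": 1.48},
    ],
}
print(json.dumps(log, indent=2))
```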

@clint-lawrence
Collaborator

I don't have particularly strong feelings one way or the other. Do you think it would be feasible to write a script to take logs in the existing CSV format and convert them to this JSON format? If not, having two completely different log formats to import into a future log server makes things more complicated, not less.

One thing I notice missing is the fine-grained timestamps. Perhaps you've just not captured them yet, but it would be good to include the operator waiting time.
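A converter along those lines could be sketched as follows. The real column layout of the existing pseudo-CSV isn't documented in this thread, so the column positions and field names below are placeholder assumptions to be swapped for the actual format:

```python
import csv
import json
from pathlib import Path

def convert_log(csv_path, json_path):
    """Sketch of a one-way converter from the old pseudo-CSV logs to JSON.

    Placeholder column layout assumed here: timestamp, test name, result,
    then anything else. Replace with whatever the Fixate log writer
    actually emits.
    """
    tests = []
    with open(csv_path, newline="") as f:
        for row in csv.reader(f):
            if not row:
                continue
            tests.append({
                "timestamp": row[0],
                "test_name": row[1],
                "result": row[2],
                "extra": row[3:],  # anything the old format tacked on
            })
    Path(json_path).write_text(json.dumps({"tests": tests}, indent=2))
```

Run once over the historical log directory, that would leave a single format to import into a future log server.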

@josh-marshall-amp
Contributor

Back in Jan 2022 I was looking at the existing online log viewer, and had a chat with Clint about this general area.

I didn't go ahead with it, but I toyed with the idea of moving it all to structured logging, as I've previously migrated core business Excel sheets to web apps with Django. The way I approach that is all about the import script: run it a thousand times until everything imports cleanly (because the data is always messy), and then you know your structure covers everything required.

So I had a quick play with Django and sqlite-utils as an approach. Notes from that time period are below... they're terse, but if you end up doing something about this, hit me up:

# Meeting with Clint re: CPE web app
- cpeapps
- amptest log viewer
  - import progression logs
  - import part names
- jira scraper

## Existing log viewer

https://stackoverflow.com/questions/10525725/which-nosql-database-should-i-use-for-logging

MongoDB Capped Collections are extremely popular and suitable for logging, with the added bonus of being 'schema less', which is usually a semantic fit for logging. Often we only know what we want to log well into a project, or after certain issues have been found in production. Relational databases or strict schemas tend to be difficult to change in these cases, and attempts to make them 'flexible' tends just to make them 'slow' and difficult to use or understand.


AmptestLogIndex
  https://github.com/Ampcontrol-CPE/AmptestLogTool/blob/main/AmptestLogIndex/management/commands/populateparts.py
  MongoDB "collection" is... a table?

  parser = AmptestLogParserWeb.argParserBuilder()  # builds arguments
  AmptestLogParserWeb.parseToMongo(args)  # parse logs
  command = 'rsync -azP "/media/share/Production/Test Programs/Amptest Logs/" "/home/ubuntu/Amptest-Web-App/Amptest Logs"'
              # command = 'robocopy "G:\Production\Test Programs\Amptest Logs" "C:\Amptest Logs" /NOCOPY /E /copy:DAT /Z /MT:32 /log+:amptestlogcopy.log"'

/Volumes/Groups/Production/Test Programs/Amptest Logs

Django JSONField on sqlite: supported (the JSON1 extension is enabled by default as of Python 3.9)
    https://code.djangoproject.com/wiki/JSON1Extension
    https://docs.djangoproject.com/en/5.0/ref/models/fields/#django.db.models.JSONField
    https://docs.djangoproject.com/en/5.0/topics/db/queries/#querying-jsonfield
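Under Django's JSONField lookups, SQLite's built-in JSON1 functions do the work. A minimal sqlite3 sketch of querying inside a stored JSON document (the table and field names are invented for illustration):

```python
import json
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE log (doc TEXT)")
con.execute(
    "INSERT INTO log VALUES (?)",
    (json.dumps({"result": "PASS", "runtime": 1.5}),),
)
# json_extract pulls a value out of the stored document by JSON path.
row = con.execute("SELECT json_extract(doc, '$.result') FROM log").fetchone()
print(row[0])  # PASS
```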


log_import: import_time, file_modification_time, file_hash
log_sequence: log_import
log_line: import_id

    # Model fields from the notes. The import and class wrapper were omitted
    # in the original; ObjectIdField suggests the djongo MongoDB backend,
    # and the model name below is a guess.
    from djongo import models

    class AmptestLogLine(models.Model):
        id = models.ObjectIdField(db_column="_id")
        File_Name = models.CharField(max_length=256)
        Date = models.CharField(max_length=128)
        Part_Number = models.IntegerField()
        Serial_Number = models.BigIntegerField()
        Test_Name = models.TextField()
        Script_Name = models.TextField()
        Script_Version = models.CharField(max_length=128)
        Fixate_Version = models.CharField(max_length=128)
        Report_Format = models.IntegerField()
        Index_String = models.TextField()
        Driver_Serials = models.TextField()
        Result = models.CharField(max_length=128)
        Runtime = models.FloatField()
        Runtime_WO_User = models.FloatField()
        Repeat_Tests = models.BooleanField()
        Full_Path = models.CharField(max_length=512)
        RAW = models.TextField()


Update 6th January 2022: sqlite-utils 3.20 introduced a new sqlite-utils insert ... --lines option for importing raw lines, so you can now achieve this without using jq at all. See Inserting unstructured data with --lines and --text for details.

Now I had a SQLite table with a single column, line. Next step: parse that nasty log format.

https://www.debuggex.com/


To my surprise I couldn’t find an existing Python library for parsing key=value key2="quoted value" log lines. Instead I had to figure out a regular expression:

([^\s=]+)=(?:"(.*?)"|(\S+))
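For reference, a small Python wrapper around that regex, turning a `key=value key2="quoted value"` line into a dict:

```python
import re

LOGFMT = re.compile(r'([^\s=]+)=(?:"(.*?)"|(\S+))')

def parse_logfmt(line):
    # Each match yields (key, quoted_value, bare_value); only one of the
    # two value groups is populated per match.
    return {
        key: quoted if quoted else bare
        for key, quoted, bare in LOGFMT.findall(line)
    }

print(parse_logfmt('level=info msg="disk full" code=507'))
# {'level': 'info', 'msg': 'disk full', 'code': '507'}
```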

https://sqlite-utils.datasette.io/en/stable/cli.html#cli-insert-convert

sqlite-utils insert logs.db loglines access.log --convert '
type, source, _, verb, path, _, status, _ = line.split()
return {
    "type": type,
    "source": source,
    "verb": verb,
    "path": path,
    "status": status,
}' --lines

https://ampcontrol.atlassian.net/wiki/spaces/CPEK/pages/121176103/Log+Format

idea... save raw line into json array, then migrate it to proper format
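That staging idea might look like this, assuming a plain SQLite table (table and column names invented): load every raw line verbatim first, then migrate in a second pass that can be re-run until everything parses cleanly.

```python
import sqlite3

def stage_raw_lines(db_path, log_path):
    # Stage 1: keep every raw line untouched so nothing is lost and the
    # parse/migrate step can be re-run as the schema firms up.
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS log_line (id INTEGER PRIMARY KEY, line TEXT)"
    )
    with open(log_path) as f:
        con.executemany(
            "INSERT INTO log_line (line) VALUES (?)",
            ((line.rstrip("\n"),) for line in f),
        )
    con.commit()
    return con
```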
